Survey on Vision-based Path Prediction
Path prediction is a fundamental task for estimating how pedestrians or
vehicles are going to move in a scene. Because path prediction as a task of
computer vision uses video as input, various kinds of information used for
prediction, such as the environment surrounding the target and the internal
state of the target, need to be estimated from the video in addition to
predicting paths.
Many prediction approaches that include understanding the environment and the
internal state have been proposed. In this survey, we systematically summarize
methods of path prediction that take video as input and extract features
from the video. Moreover, we introduce datasets used to evaluate path
prediction methods quantitatively.
Comment: DAPI 201
CAR-Net: Clairvoyant Attentive Recurrent Network
We present an interpretable framework for path prediction that leverages
dependencies between agents' behaviors and their spatial navigation
environment. We exploit two sources of information: the past motion trajectory
of the agent of interest and a wide top-view image of the navigation scene. We
propose a Clairvoyant Attentive Recurrent Network (CAR-Net) that learns where
to look in a large image of the scene when solving the path prediction task.
Our method can attend to any area, or combination of areas, within the raw
image (e.g., road intersections) when predicting the trajectory of the agent.
This allows us to visualize fine-grained semantic elements of navigation scenes
that influence the prediction of trajectories. To study the impact of space on
agents' trajectories, we build a new dataset made of top-view images of
hundreds of scenes (Formula One racing tracks) where agents' behaviors are
heavily influenced by known areas in the images (e.g., upcoming turns). CAR-Net
successfully attends to these salient regions. Additionally, CAR-Net reaches
state-of-the-art accuracy on the standard trajectory forecasting benchmark,
Stanford Drone Dataset (SDD). Finally, we show CAR-Net's ability to generalize
to unseen scenes.
Comment: The 2nd and 3rd authors contributed equally.
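The attention idea described in this abstract can be illustrated with a minimal sketch: score every spatial cell of a scene feature map against an encoding of the agent's past trajectory, then pool the cells with softmax weights. The function name, the bilinear scoring form, and all shapes below are illustrative assumptions, not CAR-Net's actual architecture.

```python
import numpy as np

def soft_attention(scene_feats, agent_state, W):
    """Score each spatial cell of a scene feature map against the agent's
    state, then pool the cells with softmax weights.

    scene_feats : (H*W, D) array -- one feature vector per grid cell
    agent_state : (K,) array     -- encoding of the past trajectory
    W           : (D, K) array   -- learned bilinear scoring matrix
    Returns the attended context vector (D,) and the weights (H*W,).
    """
    scores = scene_feats @ (W @ agent_state)         # (H*W,) relevance scores
    scores -= scores.max()                           # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum()  # softmax over cells
    context = weights @ scene_feats                  # weighted pooling -> (D,)
    return context, weights
```

The returned weights are exactly what makes such a model interpretable: plotted over the image grid, they show which regions (e.g., an upcoming turn) drove the prediction.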
Vehicle Trajectories from Unlabeled Data through Iterative Plane Registration
One of the most complex aspects of autonomous driving is understanding the surrounding environment. In particular, interest centers on detecting which agents populate it and how they are moving. The capacity to predict how these agents may act in the near future would allow an autonomous vehicle to safely plan its trajectory, minimizing the risks for itself and others. In this work we propose an automatic trajectory annotation method that exploits an Iterative Plane Registration algorithm based on homographies and semantic segmentations. The output of our technique is a set of holistic trajectories (past-present-future) paired with a single image context, useful for training a predictive model.
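Homography-based registration, as used in this pipeline, rests on mapping image points onto a common plane through a 3x3 projective transform. A minimal sketch of that mapping, assuming the homography is already known (the iterative estimation itself is the paper's contribution and is omitted here):

```python
import numpy as np

def apply_homography(H, pts):
    """Map 2-D points through a 3x3 homography (projective transform).

    H   : (3, 3) homography matrix
    pts : (N, 2) image-plane points
    Returns the (N, 2) transformed points.
    """
    pts_h = np.hstack([pts, np.ones((len(pts), 1))])  # to homogeneous coords
    mapped = pts_h @ H.T                              # projective transform
    return mapped[:, :2] / mapped[:, 2:3]             # back to Euclidean

# Example points on the image plane
pts = np.array([[10.0, 20.0], [30.0, 40.0]])
```

Chaining such per-frame homographies onto a shared ground plane is what lets detections from different frames be stitched into a single holistic trajectory.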
Social Navigation with Human Empowerment Driven Deep Reinforcement Learning
Mobile robot navigation has seen extensive research in the last decades. The
aspect of collaboration with robots and humans sharing workspaces will become
increasingly important in the future. Therefore, the next generation of mobile
robots needs to be socially-compliant to be accepted by their human
collaborators. However, a formal definition of compliance is not
straightforward. On the other hand, empowerment has been used by artificial
agents to learn complicated and generalized actions and also has been shown to
be a good model for biological behaviors. In this paper, we go beyond the
approach of classical reinforcement learning (RL) and provide our agent with intrinsic motivation
using empowerment. In contrast to self-empowerment, a robot employing our
approach strives for the empowerment of people in its environment, so they are
not disturbed by the robot's presence and motion. In our experiments, we show
that our approach has a positive influence on humans, as it minimizes its
distance to humans and thus decreases human travel time while moving
efficiently towards its own goal. An interactive user study shows that
participants consider our method more social than other state-of-the-art
approaches.
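Empowerment is, informally, the channel capacity between an agent's actions and its resulting future states. For deterministic dynamics this collapses to the log of the number of distinct reachable states, which admits a tiny sketch; the function and the toy grid dynamics below are illustrative assumptions, not the paper's method (which estimates the empowerment of nearby humans, not the robot's own).

```python
import numpy as np
from itertools import product

def empowerment_deterministic(step, state, actions, horizon):
    """n-step empowerment under deterministic dynamics: the log of how many
    distinct states are reachable via length-`horizon` action sequences
    (channel capacity reduces to log |reachable states| in this case).
    """
    reachable = set()
    for seq in product(actions, repeat=horizon):
        s = state
        for a in seq:          # roll the dynamics forward
            s = step(s, a)
        reachable.add(s)
    return np.log2(len(reachable))

# Toy 1-D dynamics: moving left/right on an unbounded line
step = lambda s, a: s + a
```

Two steps over actions {-1, +1} from the origin reach {-2, 0, +2}, so the 2-step empowerment is log2(3) bits; an agent whose only action is "stay" has zero empowerment, which is the intuition the paper exploits: a socially compliant robot should avoid collapsing the options available to nearby people.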
MASON: A Model AgnoStic ObjectNess Framework
This paper proposes a simple yet very effective method to localize dominant foreground objects in an image to pixel-level precision. The proposed method, ‘MASON’ (Model-AgnoStic ObjectNess), uses a deep convolutional network to generate category-independent and model-agnostic heat maps for any image. The network is not explicitly trained for the task and can therefore be used off-the-shelf in tandem with any other network or task. We show that this framework scales to a wide variety of images and illustrate the effectiveness of MASON in three varied application contexts.
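A crude approximation of an objectness heat map, in the spirit of the abstract above, is to collapse the channel activations of a pretrained network's feature map into one spatial map and normalise it. This is a simplified stand-in, and `objectness_heatmap` is a hypothetical name; MASON's actual procedure differs in detail.

```python
import numpy as np

def objectness_heatmap(feature_map):
    """Collapse a (C, H, W) conv feature map into a single (H, W) heat map
    by summing absolute channel activations, then min-max normalising to
    [0, 1]. Regions that excite many channels score high.
    """
    heat = np.abs(feature_map).sum(axis=0)  # aggregate across channels
    heat -= heat.min()
    rng = heat.max()
    return heat / rng if rng > 0 else heat
```

Upsampled to the input resolution, such a map can then be thresholded to a foreground mask, which is roughly the role objectness plays in the applications the paper describes.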
RED: A simple but effective Baseline Predictor for the TrajNet Benchmark
In recent years, there has been a shift from modeling the tracking problem with Bayesian formulations towards using deep neural networks. To this end, this paper evaluates the effectiveness of various deep neural networks for predicting future pedestrian paths. The analyzed networks rely solely, as in traditional approaches, on observed tracklets without human-human interaction information. The evaluation is done on the publicly available TrajNet benchmark dataset [39], which builds a repository of sizable and popular datasets for trajectory prediction. We show how a recurrent encoder with a dense layer stacked on top, referred to as RED-predictor, achieves the top rank in the TrajNet 2018 challenge compared to more elaborate models. Further, we investigate failure cases, explain observed phenomena, and give recommendations for overcoming the demonstrated shortcomings.
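The RED idea (a recurrent encoder over the observed tracklet, a dense layer decoding future position offsets) can be sketched in a few lines. The vanilla-RNN cell, the weight names, and the offset parameterisation below are simplifying assumptions for illustration, not the paper's exact configuration.

```python
import numpy as np

def red_predict(track, Wx, Wh, Wd, horizon):
    """Minimal RED-style predictor: a vanilla RNN encodes the observed
    tracklet, and a dense layer maps the final hidden state to future
    position offsets, accumulated from the last observed position.

    track : (T, 2) observed positions
    Wx    : (H, 2) RNN input weights, Wh : (H, H) recurrent weights
    Wd    : (horizon*2, H) dense read-out weights
    """
    offsets = np.diff(track, axis=0)       # predict offsets, not positions
    h = np.zeros(Wh.shape[0])
    for o in offsets:                      # recurrent encoding of the tracklet
        h = np.tanh(Wx @ o + Wh @ h)
    future = (Wd @ h).reshape(horizon, 2)  # dense decode to per-step offsets
    return track[-1] + np.cumsum(future, axis=0)
```

Feeding offsets rather than absolute coordinates keeps the model translation-invariant, which is one reason such a simple predictor can be a strong baseline.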